
    Par3DNet: Using 3DCNNs for Object Recognition on Tridimensional Partial Views

    Deep learning-based methods have proven to be the best performers for object recognition in both images and tridimensional data. Nonetheless, for 3D object recognition, most authors convert the 3D data to images and then perform the classification in 2D. Despite its accuracy, this approach has some issues. In this work, we present a deep learning pipeline for object recognition that takes a point cloud as input and provides classification probabilities as output. Our proposal is trained on synthetic CAD objects and performs accurately when fed with real data from commercial sensors. Unlike most approaches, our method is specifically trained on partial views of the objects rather than on full representations, since commercial sensors do not capture full representations of the objects but partial views. We trained our proposal on the ModelNet10 dataset and achieved 78.39% accuracy. We also tested it by adding noise to the dataset and against a number of other datasets and real data, with high success. This work has been funded by the Spanish Government TIN2016-76515-R grant for the COMBAHO project, supported with Feder funds. It has also been supported by Spanish grants for PhD studies ACIF/2017/243 and FPU16/00887.
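
    The pipeline above is only described at a high level; the following is a minimal sketch of the general idea, assuming a voxel-occupancy input to a small 3D CNN in PyTorch. The layer sizes, grid resolution and class count are illustrative assumptions, not the published Par3DNet architecture.

```python
# Hypothetical sketch: a small 3D CNN mapping a voxelized partial point cloud
# to class probabilities. Layer sizes and grid resolution are assumptions.
import torch
import torch.nn as nn

class PartialView3DCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),              # global pooling
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, voxels: torch.Tensor) -> torch.Tensor:
        # voxels: (batch, 1, G, G, G) occupancy grid built from one partial view
        x = self.features(voxels).flatten(1)
        return self.classifier(x).softmax(dim=-1)  # class probabilities

def voxelize(points: torch.Tensor, grid: int = 32) -> torch.Tensor:
    """Turn an (N, 3) point cloud into a binary occupancy grid."""
    pts = points - points.min(dim=0).values
    pts = pts / (pts.max() + 1e-8) * (grid - 1)
    vol = torch.zeros(1, 1, grid, grid, grid)
    idx = pts.long().clamp(0, grid - 1)
    vol[0, 0, idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return vol

if __name__ == "__main__":
    cloud = torch.rand(2048, 3)                   # stand-in for a sensor scan
    probs = PartialView3DCNN()(voxelize(cloud))
    print(probs.shape)                            # torch.Size([1, 10])
```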

    A Voxelized Fractal Descriptor for 3D Object Recognition

    Currently, state-of-the-art methods for 3D object recognition rely on deep learning pipelines. Nonetheless, these methods require a large amount of data that is not easy to obtain. In addition, most of them exploit particular features of the datasets, such as the fact that they are CAD models, to create rendered representations; this does not transfer to real life, where 3D sensors provide point clouds. We propose a novel global descriptor for point clouds which takes advantage of the fractal dimension of the objects. Our approach has many benefits: it is agnostic to the point density of the sample, the number of points in the input cloud, the sensor of choice, and noise up to a certain level, and it works on real-life point cloud data provided by commercial sensors. We tested our descriptor for 3D object recognition using ModelNet, a well-known dataset for that task. Our approach achieves 92.84% accuracy on ModelNet10 and 88.74% accuracy on ModelNet40. This work was supported in part by the Spanish Government, with Feder funds, under Grant PID2019-104818RB-I00, and in part by the Spanish Grants for Ph.D. studies under Grant ACIF/2017/243 and Grant FPU16/00887.
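
    A box-counting estimate is a common way to approximate a fractal dimension on a voxelized point cloud; the sketch below illustrates that general idea. The chosen resolutions and the final feature layout are assumptions for illustration, not the published descriptor.

```python
# Box-counting sketch of a fractal-dimension style descriptor for a point
# cloud. Scales and feature layout are illustrative assumptions.
import numpy as np

def box_counts(points: np.ndarray, resolutions=(2, 4, 8, 16, 32)) -> np.ndarray:
    """Number of occupied voxels at each grid resolution."""
    pts = points - points.min(axis=0)
    pts = pts / (pts.max() + 1e-8)                      # normalise to [0, 1]
    counts = []
    for r in resolutions:
        idx = np.clip((pts * r).astype(int), 0, r - 1)  # voxel index per point
        counts.append(len({tuple(v) for v in idx}))     # occupied voxels
    return np.array(counts, dtype=float)

def fractal_descriptor(points: np.ndarray) -> np.ndarray:
    resolutions = np.array([2, 4, 8, 16, 32], dtype=float)
    counts = box_counts(points, tuple(resolutions.astype(int)))
    # Box-counting dimension: slope of log(count) against log(resolution).
    dim = np.polyfit(np.log(resolutions), np.log(counts), 1)[0]
    # Simple global feature: estimated dimension plus normalised counts,
    # which is largely independent of the point density of the sample.
    return np.concatenate([[dim], counts / counts[-1]])

if __name__ == "__main__":
    cloud = np.random.rand(5000, 3)
    print(fractal_descriptor(cloud))
```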

    UrOAC: Urban Objects in Any-light Conditions

    In past years, several works on urban object detection from the point of view of a person have been presented. These works are intended to provide an enhanced understanding of the environment for blind and visually challenged people. The mentioned approaches mostly rely on deep learning and machine learning methods. Nonetheless, these approaches only work with direct and bright light; that is, they only perform correctly in daylight conditions. This is because deep learning algorithms require large amounts of data and the currently available datasets do not address this matter. In this work, we propose UrOAC, a dataset of urban objects captured in a range of different lighting conditions, from bright daylight to low and poor night-time lighting. In the latter, the objects are only lit by low ambient light, street lamps and the headlights of passing vehicles. The dataset depicts the following objects: pedestrian crosswalks, green traffic lights and red traffic lights. The annotations include the category and the bounding box of each object. This dataset could be used to improve the performance at night-time and under low-light conditions of any vision-based method that involves urban objects, for instance, guidance and object detection devices for the visually challenged or self-driving and intelligent vehicles. This work has been supported by the Spanish Government PID2019-104818RB-I00 Grant, co-funded by EU Structural Funds.

    Accurate and efficient 3D hand pose regression for robot hand teleoperation using a monocular RGB camera

    In this paper, we present a novel deep learning-based architecture, within the scope of expert and intelligent systems, to perform accurate real-time tridimensional hand pose estimation using a single RGB frame as input, so there is no need for multiple cameras, multiple points of view, or RGB-D devices. The proposed pipeline is composed of two convolutional neural network architectures. The first one is in charge of detecting the hand in the image. The second one accurately infers the tridimensional position of the joints, thus retrieving the full hand pose. To do this, we captured our own large-scale dataset composed of images of hands and the corresponding 3D joint annotations. The proposal achieved a 3D hand pose mean error of below 5 mm on both the proposed dataset and the Stereo Hand Pose Tracking Benchmark, which is a public dataset, and it outperforms state-of-the-art methods. We also demonstrate the application of the proposal to robotic hand teleoperation with high success. This work has been supported by the Spanish Government TIN2016-76515R Grant, supported with Feder funds. This work has also been supported by a Spanish grant for PhD studies ACIF/2017/24
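
    As a rough illustration of the two-stage design (hand detection followed by 3D joint regression), the following PyTorch sketch wires two placeholder networks together; the backbones, crop size and 21-joint layout are assumptions, not the authors' trained models.

```python
# Hedged sketch of the two-stage idea: one network proposes a hand crop from
# the full RGB frame, a second regresses 3D joint positions from that crop.
import torch
import torch.nn as nn

class HandDetector(nn.Module):
    """Stage 1: predicts a (x1, y1, x2, y2) hand box, normalised to [0, 1]."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 4), nn.Sigmoid(),
        )
    def forward(self, img):
        return self.net(img)

class JointRegressor(nn.Module):
    """Stage 2: regresses 21 x 3 joint coordinates from the cropped hand."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 21 * 3),
        )
    def forward(self, crop):
        return self.net(crop).view(-1, 21, 3)

def estimate_pose(frame, detector, regressor, crop_size=128):
    box = detector(frame)[0]                       # normalised box
    h, w = frame.shape[-2:]
    x1, y1, x2, y2 = (box * torch.tensor([w, h, w, h])).long().tolist()
    crop = frame[..., y1:max(y2, y1 + 1), x1:max(x2, x1 + 1)]
    crop = nn.functional.interpolate(crop, size=(crop_size, crop_size))
    return regressor(crop)                         # (1, 21, 3) joint positions

if __name__ == "__main__":
    frame = torch.rand(1, 3, 480, 640)             # a single RGB frame
    print(estimate_pose(frame, HandDetector(), JointRegressor()).shape)
```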

    RoboToy Demoulding: Robotic Demoulding System for Toy Manufacturing Industry

    Industrial environments and product manufacturing processes are currently being automated and robotized. Nowadays, it is common to have robots integrated in the automotive industry, robots palletizing in the food industry and robots performing welding tasks in the metal industry. However, many traditional and manual sectors, such as the toy manufacturing industry, remain out of date with technology. This work describes a new robotic system able to perform the demoulding task in a toy manufacturing process, which is a tedious, labor-intensive and potentially hazardous task for human operators. The system is composed of machinery specialised in the rotational moulding manufacturing process, cameras, actuators, and a collaborative robot. A vision-based algorithm makes this system capable of handling soft plastic pieces which are deformable and flexible during demoulding. The system reduces the stress on and potential injuries to human operators, allowing them to perform other tasks with higher dexterity requirements or to relocate to other sub-tasks of the process where the physical effort is lower.

    A Hand Motor Skills Rehabilitation for the Injured Implemented on a Social Robot

    In this work, we introduce HaReS, a hand rehabilitation system. Our proposal integrates a series of exercises, jointly developed with a foundation for people with motor and cognitive injuries, that are aimed at improving the skills of patients and their adherence to the rehabilitation plan. Our system takes advantage of a low-cost hand-tracking device to provide a quantitative analysis of the performance of the patient. It also integrates a low-cost surface electromyography (sEMG) sensor in order to provide insight into which muscles are being activated while completing the exercises. It is also modular and can be deployed on a social robot. We tested our proposal in two different rehabilitation facilities with high success. The therapists and patients felt more motivated while using HaReS, which improved adherence to the rehabilitation plan. In addition, the therapists were able to provide services to more patients than with their traditional methodology. This work was funded by a Spanish Government PID2019-104818RB-I00 grant, supported by Feder funds. It was also supported by Spanish grants for PhD studies ACIF/2017/243 and FPU16/00887.
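
    As an example of the kind of quantitative feedback such a system can compute, the sketch below derives a finger flexion angle from tracked 3D joints and a moving-RMS envelope from a raw sEMG trace; the specific metrics are assumptions for illustration, not the published HaReS analysis.

```python
# Illustrative metrics only: a PIP flexion angle from three tracked joint
# positions and a moving RMS of an sEMG signal as a rough activation proxy.
import numpy as np

def flexion_angle(mcp: np.ndarray, pip: np.ndarray, tip: np.ndarray) -> float:
    """Angle (degrees) at the PIP joint given three 3D joint positions."""
    a, b = mcp - pip, tip - pip
    cosang = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)
    return float(np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0))))

def semg_envelope(signal: np.ndarray, window: int = 50) -> np.ndarray:
    """Moving RMS of a raw sEMG trace."""
    padded = np.pad(signal.astype(float) ** 2, (window // 2, window // 2), mode="edge")
    kernel = np.ones(window) / window
    return np.sqrt(np.convolve(padded, kernel, mode="valid")[: len(signal)])

if __name__ == "__main__":
    print(flexion_angle(np.array([0, 0, 0]), np.array([0, 3, 0]), np.array([2, 5, 0])))
    print(semg_envelope(np.random.randn(1000)).shape)
```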

    A new automatic method for demoulding plastic parts using an intelligent robotic system

    Nowadays, there are many industrial processes in which people spend several hours performing tedious and repetitive tasks. Furthermore, most of these processes involve the manipulation of dangerous materials or machinery, as in toy manufacturing, where people handle ovens at high temperatures and make wearying physical efforts over long periods of time. In this work, we present an automatic and innovative collaborative robotic system that is able to deal with the demoulding task during the manufacturing process of toy dolls. The intelligent robotic system is composed of a UR10e robot with an integrated RealSense RGB-D camera, which detects the pieces in the mould using a purpose-built vision-based algorithm and extracts them by means of a custom gripper located at the end of the robot. We introduce a pipeline to perform the demoulding task on different plastic pieces relying on this intelligent robotic system. Finally, to validate this approach, the automatic method has been successfully implemented in a real toy factory, providing a novel approach to this traditional manufacturing process. The paper describes the robotic system performance using different forces and velocities, obtaining a success rate of more than 90% in the experimental results. Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This work has been carried out within the scope of an Industrial PhD at AIJU in the context of the SOFTMANBOT Project, with European funding from the Horizon 2020 research programme (G.A. 869855). In addition, it has been supported by the UAIND21-06B grant of the University of Alicante.
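
    Purely as an illustration of how pieces might be located in an RGB-D frame, the sketch below thresholds depth against an empty-mould reference and returns bounding boxes with OpenCV; the depth-difference strategy and all thresholds are assumptions, not the deployed algorithm.

```python
# Illustrative sketch (not the deployed system): locate moulded pieces by
# thresholding depth against the empty-mould depth. Thresholds are assumptions.
import cv2
import numpy as np

def detect_pieces(depth_mm: np.ndarray, empty_mould_mm: np.ndarray,
                  min_height_mm: float = 5.0, min_area_px: int = 500):
    """Return bounding boxes (x, y, w, h) of regions that rise above the mould."""
    height = empty_mould_mm.astype(np.float32) - depth_mm.astype(np.float32)
    mask = (height > min_height_mm).astype(np.uint8) * 255
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, np.ones((5, 5), np.uint8))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    return [cv2.boundingRect(c) for c in contours
            if cv2.contourArea(c) >= min_area_px]

if __name__ == "__main__":
    empty = np.full((480, 640), 800, dtype=np.uint16)   # mould 800 mm away
    scene = empty.copy()
    scene[200:260, 300:380] = 780                       # a 20 mm tall piece
    print(detect_pieces(scene, empty))                  # one box around the piece
```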

    Accurate Multilevel Classification for Wildlife Images

    The most common approaches to classification rely on the inference of a single specific class. However, every category can be naturally organized within a taxonomic tree, from the most general concept to the most specific element, which is how human knowledge is organized. This representation avoids the need to learn roughly the same features for a range of very similar categories, is easier to understand and work with, and provides a classification at each abstraction level. In this paper, we carry out an exhaustive study of different methods to perform multilevel classification applied to the task of classifying wild animal and plant species. Different convolutional backbones, data setups, and ensembling techniques are explored to find the model that provides the best performance. As our experiments show, in order to achieve the best performance on datasets that are arranged in a tree-like structure, the classifier must feature an EfficientNetB5 backbone, with an input size of px, followed by a multilevel classifier, and a Multiscale Crop data augmentation process must be carried out. This setup achieves 62% top-1 accuracy and 88% top-5 accuracy. The architecture could benefit from an accuracy boost if used in an ensemble of cascade classifiers, but the computational demand would be unbearable for any real application. This work was funded by the Spanish Government PID2019-104818RB-I00 grant, supported with FEDER funds. It was supported by Spanish grants for PhD studies ACIF/2017/243 and FPU16/00887.
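
    A minimal sketch of a multilevel classifier follows: one shared backbone with one head per taxonomy level, trained with a summed cross-entropy loss. The tiny convolutional backbone stands in for the EfficientNetB5 used in the paper, and the level sizes are made-up numbers.

```python
# Sketch of a taxonomic multilevel classifier: shared features, one head per
# level (e.g. family / genus / species). Backbone and sizes are placeholders.
import torch
import torch.nn as nn

class MultiLevelClassifier(nn.Module):
    def __init__(self, level_sizes=(10, 40, 120), feat_dim: int = 128):
        super().__init__()
        self.backbone = nn.Sequential(            # stand-in feature extractor
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.heads = nn.ModuleList([nn.Linear(feat_dim, n) for n in level_sizes])

    def forward(self, x):
        feats = self.backbone(x)
        return [head(feats) for head in self.heads]  # logits per taxonomy level

def multilevel_loss(outputs, targets):
    """Sum of cross-entropy losses, one per level of the taxonomy."""
    ce = nn.CrossEntropyLoss()
    return sum(ce(out, tgt) for out, tgt in zip(outputs, targets))

if __name__ == "__main__":
    model = MultiLevelClassifier()
    images = torch.rand(4, 3, 224, 224)
    targets = [torch.randint(0, n, (4,)) for n in (10, 40, 120)]
    outputs = model(images)
    print([o.shape for o in outputs], multilevel_loss(outputs, targets).item())
```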

    Three-dimensional reconstruction using SFM for actual pedestrian classification

    In recent years, the popularity of intelligent and autonomous vehicles has grown notably. In fact, there already exist commercial models with a high degree of autonomy in terms of self-driving capabilities. A key feature for this kind of vehicle is object detection, which is commonly performed in 2D space. This has some inherent issues: an object and a depiction of that object would both be classified as the actual object, which is inadequate since urban environments are full of billboards, printed adverts and posters that would likely make these systems fail. In order to overcome this problem, a 3D sensor could be leveraged, although this would make the platform more expensive, energy-inefficient and computationally complex. Thus, we propose the use of structure from motion to reconstruct the three-dimensional information of the scene from a set of images, and merge the 2D and 3D data to differentiate actual objects from depictions. Our approach works with a regular color camera; no 3D sensors whatsoever are required. As the experiments confirm, our approach is able to distinguish between actual pedestrians and depictions of them in more than 87% of cases in synthetic and real-world tests in the worst scenarios, while the accuracy is almost 98% in the best case. This work was funded by a Spanish Government PID2019-104818RB-I00 grant, supported by Feder funds. It was also supported by the Spanish grant for Ph.D. studies FPU16/00887. Experiments were made possible by a generous hardware donation from NVIDIA.
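
    One simple way to merge the 2D and 3D cues is a planarity test on the reconstructed points that fall inside a detection; the sketch below fits a plane via SVD and flags nearly flat detections as depictions. The planarity threshold is an assumption for illustration, not necessarily the criterion used in the paper.

```python
# Hedged sketch of the depiction check: detections whose reconstructed points
# are nearly coplanar are treated as flat depictions (posters, billboards).
import numpy as np

def is_depiction(points_3d: np.ndarray, planarity_threshold: float = 0.05) -> bool:
    """points_3d: (N, 3) points inside the detection, from structure from motion."""
    centered = points_3d - points_3d.mean(axis=0)
    # Smallest singular value ~ spread of the points along the plane normal.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    thickness = singular_values[-1] / (np.sqrt(len(points_3d)) + 1e-8)
    return thickness < planarity_threshold        # flat -> likely a depiction

if __name__ == "__main__":
    flat = np.random.rand(500, 3) * [1.0, 2.0, 0.001]   # poster-like patch
    solid = np.random.rand(500, 3) * [0.5, 1.8, 0.4]    # pedestrian-like volume
    print(is_depiction(flat), is_depiction(solid))      # True False
```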

    3D object detection with deep learning

    Finding an appropriate environment representation is a crucial problem in robotics. 3D data has recently become widely used thanks to the advent of low-cost RGB-D cameras. We propose a new way to represent a 3D map based on the information provided by an expert; namely, the expert is the output of a Convolutional Neural Network trained with deep learning techniques. Relying on this information, we propose the generation of 3D maps using individual semantic labels, each associated with an object class in the environment. For each label we obtain a partial 3D map whose data belong to the 3D perceptions, namely point clouds, with an associated probability above a given threshold. The final map is obtained by registering and merging all these partial maps. The use of semantic labels provides us with a way to build the map while recognizing objects. This work has been supported by the Spanish Government TIN2016-76515-R Grant, supported with Feder funds, and by a grant of the Vicerrectorado de Investigación y Transferencia de Conocimiento para el fomento de la I+D+i en la Universidad de Alicante 2016.
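
    A minimal sketch of the per-label map idea follows: for each semantic label, keep only the 3D points whose predicted probability exceeds a threshold, then merge the partial maps into one labelled cloud. The threshold and data layout are illustrative assumptions, and the random scores below stand in for the CNN output described above.

```python
# Illustrative per-label partial maps built from per-point class probabilities.
import numpy as np

def build_partial_maps(points: np.ndarray, probs: np.ndarray, threshold: float = 0.8):
    """points: (N, 3); probs: (N, num_labels). Returns {label: (M, 3) cloud}."""
    partial = {}
    for label in range(probs.shape[1]):
        mask = probs[:, label] > threshold
        if mask.any():
            partial[label] = points[mask]
    return partial

def merge_maps(partial_maps):
    """Concatenate the partial maps into one (M, 4) array of x, y, z, label."""
    chunks = [np.column_stack([pts, np.full(len(pts), label)])
              for label, pts in partial_maps.items()]
    return np.vstack(chunks) if chunks else np.empty((0, 4))

if __name__ == "__main__":
    cloud = np.random.rand(10000, 3)
    scores = np.random.dirichlet(np.ones(5), size=10000)  # fake per-point probs
    labelled = merge_maps(build_partial_maps(cloud, scores))
    print(labelled.shape)
```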